"VIDEO ENCODING METHOD"

FIELD OF THE INVENTION 

The present invention generally relates to the field of data compression 
and, more specifically, to a method of encoding a sequence of frames, composed of 
picture elements (pixels), by means of a three-dimensional (3D) subband decomposition
involving a filtering step applied, in the sequence considered as a 3D volume, to the 
spatial-temporal data which correspond in said sequence to each one of successive 
groups of frames (GOFs), these GOFs being themselves subdivided into successive pairs 
of frames (POFs) including a so-called previous frame and a so-called current frame, 
said decomposition being applied to said GOFs together with motion estimation and 
compensation steps performed in each GOF on said POFs and on corresponding pairs
of low-frequency temporal subbands (POSs) obtained at each temporal decomposition
level. 

The invention also relates to a computer programme comprising a set of 
instructions for the implementation of said encoding method, when said programme is 
carried out by a processor included in an encoding device.

BACKGROUND OF THE INVENTION 

In recent years, three-dimensional (3D) subband analysis, based on a 3D,
or (2D+t), wavelet decomposition of a sequence of frames considered as a 3D volume,
has been more and more studied for video compression. The coefficients generated by
the wavelet transform constitute a hierarchical pyramid in which the spatio-temporal
relationship is defined thanks to 3D orientation trees evidencing the parent-offspring
dependencies between coefficients, and the in-depth scanning of the generated
coefficients in the hierarchical trees and a progressive bitplane encoding technique lead
to a desired quality scalability. The practical stage for this approach is to generate
motion compensated temporal subbands using a simple two-tap wavelet filter, as
illustrated in Fig.1 for a group of frames (GOF) of eight frames.

In the illustrated implementation, the input video sequence is divided into
Groups of Frames (GOFs), and each GOF, itself subdivided into successive couples of
frames (that are as many inputs for a so-called Motion-Compensated Temporal Filtering,
or MCTF, module), is first motion-compensated (MC) and then temporally filtered (TF).
The resulting low frequency (L) temporal subbands of the first temporal decomposition
level are further filtered (TF), and the process may stop after an arbitrary number of
decompositions resulting in one or more low frequency subbands called root temporal
subbands (in the illustration, an example with two decomposition levels resulting in two
root subbands is presented). In the example of Fig.1, the frames of the illustrated group
are referenced F1 to F8, and the dotted arrows correspond to a high-pass temporal
filtering, while the other ones correspond to a low-pass temporal filtering. Two stages of
decomposition are shown (L and H = first stage ; LL and LH = second stage). At each
temporal decomposition level of the illustrated group of 8 frames, a group of motion
vector fields is generated (in the present example, MV4 at the first level, MV3 at the
second one).

When a Haar multiresolution analysis is used for the temporal
decomposition, since one motion vector field is generated between every two frames in
the considered group of frames at each temporal decomposition level, the number of
motion vector fields is equal to half the number of frames in the temporal subband, i.e.
four at the first level of motion vector fields and two at the second one. Motion
estimation (ME) and motion compensation (MC) are only performed every two frames
of the input sequence (generally in the forward way), due to the temporal
down-sampling by two of the simple wavelet filter. Using these very simple filters, each low
frequency temporal subband (L) represents a temporal average of the input couples of
frames, whereas the high frequency one (H) contains the residual error after the MCTF.
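
By way of illustration only, the bookkeeping of the two-tap temporal filtering described above can be sketched in Python as follows. The sketch assumes zero motion (no motion compensation), takes L as the plain average of each couple of frames and H as the difference between the current and the previous frame; this normalization, the flat-list frame representation and the name `haar_temporal_levels` are assumptions made for the example and are not taken from the present description.

```python
def haar_temporal_levels(frames, levels):
    """Zero-motion Haar temporal decomposition of one group of frames.

    frames: list of frames, each frame being a flat list of pixel values.
    Returns (root_subbands, high_subbands_per_level, mv_field_counts).
    """
    low = frames
    highs = []
    mv_counts = []
    for _ in range(levels):
        if len(low) % 2:
            raise ValueError("each level needs an even number of frames")
        next_low, high = [], []
        # Each couple is (previous frame, current frame).
        for prev, cur in zip(low[0::2], low[1::2]):
            # L: temporal average of the couple (assumed normalization).
            next_low.append([(a + b) / 2 for a, b in zip(prev, cur)])
            # H: residual between current and previous frame (assumed sign).
            high.append([b - a for a, b in zip(prev, cur)])
        # One motion vector field would be generated per couple of frames.
        mv_counts.append(len(high))
        highs.append(high)
        low = next_low
    return low, highs, mv_counts


# A GOF of eight constant frames with values 0..7, two decomposition levels.
gof = [[float(i)] * 4 for i in range(8)]
roots, highs, mv_counts = haar_temporal_levels(gof, 2)
print(mv_counts)   # number of motion vector fields per level
print(len(roots))  # number of root temporal subbands
```

For a group of eight frames and two levels, the counts are four motion vector fields at the first level and two at the second one, with two root subbands, which matches the figures given above.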



Unfortunately, the motion compensated temporal filtering may raise the
problem of unconnected picture elements (or pixels), which are not filtered at all (or
also the problem of double-connected pixels, which are filtered twice). The number of
unconnected pixels represents a weakness of 3D subband codec approaches because
it highly impacts the resulting picture quality (particularly in occlusion regions). This is
especially true for high motion sequences or for final temporal decomposition levels,
where the temporal correlation is not good. The number of these unconnected pixels
depends on the dense motion vector field that has been generated by the motion
estimation.
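
By way of illustration only, the way a dense motion vector field determines the unconnected and double-connected pixels can be sketched in Python as follows. The sketch assumes that the pixel (x, y) of the current frame, with motion vector (mx, my), references the pixel (x - mx, y - my) of the previous frame; this sign convention and the names `connection_counts` and `classify` are assumptions made for the example.

```python
def connection_counts(width, height, motion_field):
    """Count the references made to each pixel of the previous frame.

    motion_field[y][x] is the (mx, my) vector of current-frame pixel (x, y).
    Vectors pointing outside the previous frame are simply ignored here.
    """
    counts = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            mx, my = motion_field[y][x]
            px, py = x - mx, y - my  # assumed sign convention
            if 0 <= px < width and 0 <= py < height:
                counts[py][px] += 1
    return counts


def classify(counts):
    """Return (unconnected, connected once, connected twice or more)."""
    flat = [c for row in counts for c in row]
    unconnected = sum(c == 0 for c in flat)
    connected = sum(c == 1 for c in flat)
    multiple = sum(c >= 2 for c in flat)
    return unconnected, connected, multiple


# One row of four pixels: the last two pixels move one pixel to the right.
field = [[(0, 0), (0, 0), (1, 0), (1, 0)]]
counts = connection_counts(4, 1, field)
print(counts)            # [[1, 2, 1, 0]]
print(classify(counts))  # (1, 2, 1)
```

In this small example the last pixel of the previous frame is never referenced (unconnected), while the second one is referenced twice (double-connected).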

Current criteria for optimal motion vector search used in motion estimators
do not take into account the number of unconnected pixels that will be the result of
motion compensation. Most sophisticated algorithms use a rate/distortion criterion
which tends to minimize a cost function that depends on the displaced difference
energy (distortion) and the number of bits spent to transmit the motion vector (rate).
For example, the motion search returns the motion vector that minimises :

$J(m) = \mathrm{SAD}(s, c(m)) + \lambda_{MOTION} \cdot R(m - p)$    (1)



with $m = (m_x, m_y)^T$ being the motion vector, $p = (p_x, p_y)^T$ being the prediction for the
motion vector, and $\lambda_{MOTION}$ being the Lagrange multiplier. The rate term $R(m - p)$
represents the motion information only and SAD is used as distortion measure. It is
computed as :

$\mathrm{SAD}(s, c(m)) = \sum_{x=1, y=1}^{B, B} |s[x, y] - c[x - m_x, y - m_y]|$    (2)

with s being the original video signal, c being the coded video signal and B being the
block size (note that B can be 1). Unfortunately, these algorithms do not take into
account the distortion introduced by unconnected pixels during the inverse motion
compensation because usually these optimizations are applied to hybrid coding for which
the inverse motion compensation is not performed.
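
By way of illustration only, the cost function of expression (1) with the SAD of expression (2) can be sketched in Python as follows. The rate term is not specified in the present description; the function `rate` below (sum of the absolute differences between the components of m and p) is a hypothetical stand-in for the number of bits, and the names `sad`, `rate` and `cost_j` are assumptions made for the example.

```python
def sad(s, c, x0, y0, block, m):
    """Sum of absolute differences over a block x block part at (x0, y0).

    s: original frame, c: coded frame, both as lists of rows.
    m: (mx, my) candidate; pixel (x, y) of s is compared to c[y - my][x - mx].
    """
    mx, my = m
    height, width = len(c), len(c[0])
    # Reject vectors whose displaced block leaves the coded frame.
    if not (0 <= x0 - mx and x0 + block - mx <= width
            and 0 <= y0 - my and y0 + block - my <= height):
        raise ValueError("displaced block lies outside the frame")
    total = 0
    for y in range(y0, y0 + block):
        for x in range(x0, x0 + block):
            total += abs(s[y][x] - c[y - my][x - mx])
    return total


def rate(m, p):
    """Hypothetical rate proxy: L1 distance between vector and prediction."""
    return abs(m[0] - p[0]) + abs(m[1] - p[1])


def cost_j(s, c, x0, y0, block, m, p, lam_motion):
    """J(m) = SAD(s, c(m)) + lambda_MOTION * R(m - p)."""
    return sad(s, c, x0, y0, block, m) + lam_motion * rate(m, p)


# The original frame is the coded frame shifted by one pixel to the left.
coded = [[10, 20, 30, 40] for _ in range(4)]
orig = [[20, 30, 40, 50] for _ in range(4)]
print(cost_j(orig, coded, 0, 0, 2, (-1, 0), (0, 0), 4))  # 4
print(cost_j(orig, coded, 0, 0, 2, (0, 0), (0, 0), 4))   # 40
```

In this example the candidate (-1, 0) gives a zero SAD and costs only its rate term, so it is preferred to the zero vector.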

SUMMARY OF THE INVENTION 

It is therefore an object of the invention to avoid such a drawback and to
propose a video encoding method in which the set of unconnected pixels is taken into
account in the distortion measure.



To this end, the invention relates to a method of encoding a sequence of
frames, composed of picture elements (pixels), by means of a three-dimensional (3D)
subband decomposition involving a filtering step applied, in the sequence considered as a
3D volume, to the spatial-temporal data which correspond in said sequence to each one of
successive groups of frames (GOFs), these GOFs being themselves subdivided into
successive pairs of frames (POFs) including a so-called previous frame and a so-called
current frame, said decomposition being applied to said GOFs together with motion
estimation and compensation steps performed in each GOF on said POFs and on
corresponding pairs of low-frequency temporal subbands (POSs) obtained at each temporal
decomposition level, this process of motion compensated temporal filtering leading in the
previous frames on the one hand to connected pixels, that are filtered along a motion
trajectory corresponding to motion vectors defined by means of said motion estimation
steps, and on the other hand to a residual number of so-called unconnected pixels, that are
not filtered at all, each motion estimation step comprising a motion search provided for
returning a motion vector that minimizes a cost function depending at least on a distortion
criterion involving a distortion measure, said distortion measure being also applied to the
set of said unconnected pixels.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will now be described, by way of example, with reference to the
accompanying drawing in which :

- Fig.1 shows a temporal multiresolution analysis with motion compensation.

DETAILED DESCRIPTION OF THE INVENTION

Because unconnected pixels contribute strongly to the quality degradation of
the inverse motion compensated image, the set of unconnected pixels is, according to the
invention, taken into account in the distortion measure. To this end, it is here proposed to
introduce a new rate/distortion criterion that extends equation (1) by taking into account the
unconnected pixels phenomenon. This is illustrated in equations (3) and (4) :
$K(m) = J(m) + \lambda_{UNCONNECTED} \cdot D(S_{UNCONNECTED}(m))$    (3)
$K(m) = \mathrm{SAD}(s, c(m)) + \lambda_{UNCONNECTED} \cdot D(S_{UNCONNECTED}(m)) + \lambda_{MOTION} \cdot R(m - p)$    (4)

with $D(S_{UNCONNECTED}(m))$ being the distortion measure for the set $S_{UNCONNECTED}$ of
unconnected pixels resulting from motion vector m. Several distortion measures can be
applied to the set of unconnected pixels. A very simple measure is preferably the count of
unconnected pixels for the motion vector under study.

It can be noted that the real set of unconnected pixels resulting from a motion
search can be computed only when the motion vector information is available for the
whole frame. Therefore, an optimal solution can hardly be achieved (in fact a complex set
of minimisation criteria for the whole frame should be solved), and a sub-optimal
implementation is therefore proposed.

This implementation is not recursive and can be considered as a simple way to
take into account the distortion due to unconnected pixels. For a given part of the image to
be motion compensated (a part of the image can be a pixel, a block of pixels, a
macroblock of pixels or any region provided that the set of parts covers the whole image
without any overlapping) and for a given motion vector candidate m, a temporary inverse
motion compensation is applied, the set of unconnected pixels is identified and
$D(S_{UNCONNECTED}(m))$ can be evaluated. The current K(m) value can be computed and
compared to the current minimum value $K_{min}(m)$ to check if the candidate motion vector
brings a lower K(m) value. When all the candidates have been tested, the (final) inverse
motion compensation is applied to the best candidate (identifying connected and
unconnected pixels). The next part of the image can then be processed, and so on up to a
complete processing of the whole image.
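
By way of illustration only, the non-recursive implementation described above can be sketched in Python as follows. The description does not spell out over which pixels the temporary set of unconnected pixels is evaluated; the sketch assumes one possible reading, namely the count of previous-frame pixels left unreferenced by the parts already processed plus the candidate under study. The rate proxy, the zero motion vector prediction p, the square block parts and the name `estimate_frame` are likewise assumptions made for the example.

```python
def estimate_frame(cur, prev, block, candidates, lam_motion, lam_unconnected):
    """Pick, part by part, the candidate vector minimising K(m).

    cur, prev: current and previous frames as lists of rows; the frame
    dimensions are assumed to be multiples of the block size.
    Returns (vectors in raster order of the parts, reference counts).
    """
    height, width = len(prev), len(prev[0])
    refs = [[0] * width for _ in range(height)]  # references per prev pixel
    vectors = []
    p = (0, 0)  # assumed motion vector prediction
    for y0 in range(0, height, block):
        for x0 in range(0, width, block):
            unref = sum(c == 0 for row in refs for c in row)
            best = None  # (K value, vector, referenced prev pixels)
            for m in candidates:
                mx, my = m
                if not (0 <= x0 - mx and x0 + block - mx <= width
                        and 0 <= y0 - my and y0 + block - my <= height):
                    continue  # displaced part leaves the frame
                # Temporary inverse motion compensation for this candidate.
                targets = [(x - mx, y - my)
                           for y in range(y0, y0 + block)
                           for x in range(x0, x0 + block)]
                sad = sum(abs(cur[ty + my][tx + mx] - prev[ty][tx])
                          for tx, ty in targets)
                newly = sum(refs[ty][tx] == 0 for tx, ty in targets)
                d_unconnected = unref - newly  # count used as D(S(m))
                rate = abs(mx - p[0]) + abs(my - p[1])  # hypothetical proxy
                k = sad + lam_unconnected * d_unconnected + lam_motion * rate
                if best is None or k < best[0]:
                    best = (k, m, targets)
            if best is None:
                raise ValueError("no candidate keeps the part in the frame")
            # Final inverse motion compensation for the best candidate.
            for tx, ty in best[2]:
                refs[ty][tx] += 1
            vectors.append(best[1])
    return vectors, refs


prev = [[10, 20, 30, 40]]
cur = [[10, 20, 20, 40]]
cands = [(0, 0), (1, 0), (-1, 0)]
for lam_u in (0, 20):
    vectors, refs = estimate_frame(cur, prev, 1, cands, 0, lam_u)
    print(lam_u, vectors, sum(c == 0 for row in refs for c in row))
```

In this one-row example with pixel-sized parts, a zero multiplier for the unconnected term selects the vector (1, 0) for the third pixel and leaves one unconnected pixel in the previous frame, whereas a multiplier of 20 selects the zero vector everywhere and leaves none.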
CLAIMS:

1. A method of encoding a sequence of frames, composed of picture elements
(pixels), by means of a three-dimensional (3D) subband decomposition involving a filtering
step applied, in the sequence considered as a 3D volume, to the spatial-temporal data
which correspond in said sequence to each one of successive groups of frames (GOFs),
these GOFs being themselves subdivided into successive pairs of frames (POFs) including a
so-called previous frame and a so-called current frame, said decomposition being applied
to said GOFs together with motion estimation and compensation steps performed in each
GOF on said POFs and on corresponding pairs of low-frequency temporal subbands (POSs)
obtained at each temporal decomposition level, this process of motion compensated
temporal filtering leading in the previous frames on the one hand to connected pixels, that
are filtered along a motion trajectory corresponding to motion vectors defined by means of
said motion estimation steps, and on the other hand to a residual number of so-called
unconnected pixels, that are not filtered at all, each motion estimation step comprising a
motion search provided for returning a motion vector that minimizes a cost function
depending at least on a distortion criterion involving a distortion measure, said distortion
measure being also applied to the set of said unconnected pixels.
2. An encoding method according to claim 1, in which said motion search is
provided for minimizing the following expression (1) :

$J(m) = \mathrm{SAD}(s, c(m)) + \lambda_{MOTION} \cdot R(m - p)$    (1)

with $m = (m_x, m_y)^T$ being the motion vector, $p = (p_x, p_y)^T$ being the prediction for the
motion vector, $\lambda_{MOTION}$ being the Lagrange multiplier, the rate term $R(m - p)$
representing the motion information only, and SAD used as distortion measure being
computed as :

$\mathrm{SAD}(s, c(m)) = \sum_{x=1, y=1}^{B, B} |s[x, y] - c[x - m_x, y - m_y]|$    (2)

with s being the original video signal, c being the coded video signal and B being the
block size, and in which the distortion criterion extends equation (1), taking into account
the unconnected pixels phenomenon for the minimizing operation that is now applied to
the following expression (3) :

$K(m) = J(m) + \lambda_{UNCONNECTED} \cdot D(S_{UNCONNECTED}(m))$    (3)
or $K(m) = \mathrm{SAD}(s, c(m)) + \lambda_{UNCONNECTED} \cdot D(S_{UNCONNECTED}(m)) + \lambda_{MOTION} \cdot R(m - p)$    (4)

with $D(S_{UNCONNECTED}(m))$ being the distortion measure for the set $S_{UNCONNECTED}$ of
unconnected pixels resulting from the motion vector m.

3. An encoding method according to claim 2, said method including, for taking
into account the distortion due to the unconnected pixels, the following steps, successively
applied to each part of the whole image to be motion-compensated :

(a) for the considered part of the image and for a given motion vector
candidate m, a temporary inverse motion compensation is applied ;

(b) the set of unconnected pixels is identified ;

(c) $D(S_{UNCONNECTED}(m))$ is evaluated ;

(d) the current K(m) value is computed and compared to the current minimum
value $K_{min}(m)$ to check if the motion vector candidate brings a lower K(m) value ;

(e) when all the candidates have been tested, a final inverse motion
compensation is applied to the best candidate ;

(f) the steps (a) to (e) are then applied to the next part of the image that can
be similarly processed, said part of the image being a pixel, a block of pixels, a macroblock
of pixels or any region provided that the set of parts covers the whole image without any
overlapping.
Abstract

The invention relates to a method of encoding a sequence of frames,
composed of picture elements (pixels), by means of a three-dimensional (3D) subband
decomposition involving a filtering step applied, in the sequence considered as a 3D
volume, to the spatial-temporal data which correspond in said sequence to each one of
successive groups of frames (GOFs), and to a non-recursive implementation of said
method. The GOFs are themselves subdivided into successive pairs of frames (POFs)
including a so-called previous frame and a so-called current frame, and the decomposition
is applied to said GOFs together with motion estimation and compensation steps performed
in each GOF on said POFs and on corresponding pairs of low-frequency temporal
subbands (POSs) obtained at each temporal decomposition level. The process of motion
compensated temporal filtering leads in the previous frames on the one hand to
connected pixels, that are filtered along a motion trajectory corresponding to motion
vectors defined by means of said motion estimation steps, and on the other hand to a
residual number of so-called unconnected pixels, that are not filtered at all. Each motion
estimation step comprises a motion search provided for returning a motion vector that
minimizes a cost function depending at least on a distortion criterion, said criterion taking
into account the unconnected pixels phenomenon for the minimizing operation.

Fig.1